Previously, I implemented the bayesian version of the OC4 empirical chlorophyll algorithm, where the input consists of the log of the ratio of the maximum of three blue bands to a green band. Here I will focus on the OC4 algorithm, part of the OCx family of empirical algorithms, which are generally expressed as follows. $$ log_{10}\left(chlor_a\right) = a_0 + \sum_{i=1}^{4}a_ilog_{10}\left(\frac{max\left(Rrs\left(\lambda_{blue}\right)\right)}{Rrs\left(\lambda_{green}\right)}\right)^i$$

However, some diagnostics of the model results suggested that a $4^{th}$ order polynomial may be too complex. This post is about model comparison , which I illustrate by pitting last post's model against a simpler $3^{rd}$ order polynomial regression in an attempt. Some of the concepts I apply include posterior predictive checks on training as well as out-of-sample data, information criteria measures, and bayes factor.



In [1]:

    
import warnings
warnings.filterwarnings('ignore')
import numpy as np
import os
from scipy.stats import norm
import matplotlib.pyplot as pl
import pymc3 as pm
import pandas as pd
import seaborn as sb
import pickle
from sklearn.metrics import mean_squared_error, r2_score
from theano import shared

from IPython.core.display import display, HTML
display(HTML("<style>.container {width:90% !important;}</style>"))
%matplotlib inline



In [2]:

    
## HELPER FUNCTIONS
# ---------------------------------------------------

def FitPoly(X,y, order=4, lin=False):
    """
    Numpy regression. Returns coeffs.
    kwargs:
        lin: specifies whether data is log transformed, and log-transforms it if not. 
    """
    
    if lin:
        X = np.log10(X)
        y = np.log10(y)
    coeffs = np.polyfit(X,y,deg=order)
    return coeffs[::-1]

def GetPPC(data, mcChain, pmModel, varname):
    ppcTrain = pm.sample_ppc(mcChain, samples=1000, model=pmModel)
    """Compute posterior predictive check for 50% and 95% density interval"""
    
    idx = np.argsort(data)
    sig0 = pm.hpd(ppcTrain[varname], alpha=0.5)[idx] 
    sig1 = pm.hpd(ppcTrain[varname], alpha=0.05)[idx]
    return sig0, sig1

def PlotPPC(data, chlObs, sig0, sig1, bayesCoeffs, oc4V6Coeffs=[], myOLS_coeffs=[],
            title='PPC', ax=None, legendOn=True, xlblOn=True, ylblOn=True,
            groupIdx=None):
    """Plot posterior predictive check. """
    
    idx = np.argsort(data)
    data_ord = data[idx]
    dataMock = np.linspace(data.min(), data.max(), 100)
    chlBayes=0
    for i,coeff in enumerate(bayesCoeffs):
        if groupIdx is None:
            chlBayes += coeff * data_ord ** i
        else:
            chlBayes += coeff[groupIdx]
    
    if not ax:
        f,ax = pl.subplots(figsize=(16,10))
    ax.fill_between(data_ord, sig0[:,0], sig0[:,1], color='k', alpha=1, label='50% HPD.')
    ax.fill_between(data_ord, sig1[:,0], sig1[:,1], color='darkgray', alpha=0.5, label='95% HDP')
    ax.plot(data, chlObs, 'ko', alpha=0.5, markeredgecolor='k',  label='Observation')
    if len(oc4V6Coeffs)>0:
        oc4V6Modl=0
        for i, coef in enumerate(oc4V6Coeffs):
            oc4V6Modl += coef * dataMock**i
        ax.plot(dataMock, oc4V6Modl,'r', linewidth=3, linestyle='--', label='OC4v6')
    if len(myOLS_coeffs)>0:
        olsMdl=0
        for i,coeff in enumerate(myOLS_coeffs[::-1]):
            olsMdl+= coeff * dataMock**i
        pl.plot(dataMock, olsMdl,'m', linewidth=5, label='OLS')
    ax.plot(data_ord, chlBayes,'b', linewidth=2, label='Bayesian Regression Mean')
    if xlblOn:
        ax.set_xlabel(r'$log_{10}\frac{max(Rrs\left(\lambda_{blue}\right))}{Rrs\left(\lambda_{green}\right)}$', fontsize=20)
    if ylblOn:
        ax.set_ylabel(r'$log_{10}\left(chlor_a\right)$', fontsize=20);
    if legendOn:
        ax.legend(fontsize=14);
    ax.set_title(title);
    return pl.gca()



In [5]:

    
dfcomp_plot_leg_lbls=['in-sample deviance', 'best waic',
                      'waic','diff. to best model']
dfNomadV2 = pd.read_pickle('./bayesianChl_DATA/NOMADV2SWFClean.pkl')
dfNomadV2.columns









    Out[5]:





Index(['id', 'depth', 'rrs411', 'rrs443', 'rrs489', 'rrs510', 'rrs555',
       'rrs670', 'hplc_chl', 'fluo_chl', 'chl_all'],
      dtype='object')

Extracting the data needed into a new dataframe...



In [8]:

    
dfNomadV2['chl_type'] = np.where(dfNomadV2.hplc_chl.isnull(),'fluor','hplc')
dfNomadV2.chl_type = dfNomadV2.chl_type.astype('category')
blues = ['rrs443', 'rrs489', 'rrs510']
dfNomadV2['maxBlue'] = dfNomadV2[blues].max(axis=1)
dfNomadV2['maxBlueId'] = dfNomadV2[blues].idxmax(axis=1)
dfNomadV2['blue2green'] = dfNomadV2.maxBlue / dfNomadV2.rrs555
#  creating new dataframe and transferring relevant data and creating new columns
dfLogOCx = pd.DataFrame(columns=['id','mxBl2Gr', 'chl',])
dfLogOCx['id'] = dfNomadV2.id
# log-transform max blue-to-green ratio column
dfLogOCx['mxBl2Gr'] = np.log10(dfNomadV2.blue2green.values)
dfLogOCx['chl'] = np.log10(dfNomadV2.chl_all.values)
# Categorize and index chl types and blue used in max blue-to-green ratio
# chl_type codes; 0: fluo, 1: hplc
dfLogOCx['chl_id'] = dfNomadV2.chl_type
dfLogOCx['chl_cat'] = dfNomadV2.chl_type.cat.codes
# blue band used in b2g; 0: 412, 1: 443, 2: 490
dfNomadV2.maxBlueCat = dfNomadV2.maxBlueId.astype('category')
dfLogOCx['maxBlue_id'] = dfNomadV2.maxBlueId
dfLogOCx['maxBlue_cat'] = dfNomadV2.maxBlueCat.cat.codes

And let's inspect the data again



In [9]:

    
dfLogOCx.info(), dfLogOCx.describe()









    



<class 'pandas.core.frame.DataFrame'>
Int64Index: 1016 entries, 0 to 4458
Data columns (total 7 columns):
id             1016 non-null int32
mxBl2Gr        1016 non-null float64
chl            1016 non-null float64
chl_id         1016 non-null category
chl_cat        1016 non-null int8
maxBlue_id     1016 non-null object
maxBlue_cat    1016 non-null int8
dtypes: category(1), float64(2), int32(1), int8(2), object(1)
memory usage: 38.8+ KB






    Out[9]:





(None,                 id      mxBl2Gr          chl      chl_cat  maxBlue_cat
 count  1016.000000  1016.000000  1016.000000  1016.000000  1016.000000
 mean   4598.373031     0.117382     0.089204     0.410433     1.297244
 std    2318.827769     0.287944     0.702093     0.492155     0.802847
 min    1441.000000    -0.353183    -1.769551     0.000000     0.000000
 25%    2118.750000    -0.087723    -0.456596     0.000000     1.000000
 50%    3702.500000     0.021409     0.068000     0.000000     2.000000
 75%    7036.250000     0.255020     0.574263     1.000000     2.000000
 max    7831.000000     0.982648     1.891341     1.000000     2.000000)



In [10]:

    
dfLogOCx.head()









    Out[10]:







  
    
      
      id
      mxBl2Gr
      chl
      chl_id
      chl_cat
      maxBlue_id
      maxBlue_cat
    
  
  
    
      0
      1565
      -0.267867
      1.581950
      fluor
      0
      rrs510
      2
    
    
      1
      1566
      -0.220457
      1.544192
      fluor
      0
      rrs510
      2
    
    
      2
      1567
      -0.268537
      1.429914
      fluor
      0
      rrs510
      2
    
    
      3
      1568
      -0.279976
      1.680879
      fluor
      0
      rrs510
      2
    
    
      4
      1559
      -0.167733
      1.371991
      fluor
      0
      rrs510
      2

As a reminder, the data look like so...



In [11]:

    
sb.set(font_scale=1.5)
g = sb.PairGrid(dfLogOCx, vars=['chl', 'mxBl2Gr'],size=5, diag_sharey=False);
g = g.map_upper(pl.scatter,alpha=0.5)
g = g.map_diag(sb.kdeplot, lw=3)
g = g.map_lower(sb.kdeplot, cmap="Purples_d");

I'll compute OLS coefficients for polynomials of degree 3 and 4 for reference. Note that I define X_train, y_train here as well, this is in contrast to the testing data that I will later assign to X_test and y_test



In [12]:

    
X_train = dfLogOCx.mxBl2Gr.values
y_train = dfLogOCx.chl.values
ols3 = FitPoly(X_train, y_train, order=3, lin=False)
ols4 = FitPoly(X_train, y_train, order=4, lin=False)

Bayesian linear regression. For now, I am simply going to assume I'm still looking for a $4^{th}$ order polynomial. Model comparison will be the subject of a subsequent post. First I have to specify some priors around the coefficients. I'll assume weakly informative gaussian priors. Before deciding an appropriate distribution for the likelihood, I will check the normality of the data by way of a normality plot, as well as a comparison between the cummulative distribution function (CDF) of the data and a normal standard distribution.



In [13]:

    
normData = norm.rvs(y_train.mean() ,y_train.std(),
                    size=y_train.size)
normDataSort = np.sort(normData)
logChlSort = np.sort(y_train)
f,ax = pl.subplots(ncols=2,figsize=(13,6))
lbl = r'$log_{10}(chl)$'
ax[0].plot(normDataSort, logChlSort,'ko', alpha=0.5)
ax[0].set_xlabel('%s' %lbl, fontsize=16)
ax[0].set_ylabel('normal standard values');
ax[0].set_title('normality plot')
ax[1].plot(logChlSort, np.linspace(0,1,logChlSort.size), label='obs. data')
ax[1].plot(normDataSort,np.linspace(0,1,normDataSort.size),'r--', 
          label='normal standard values')
ax[1].set_ylabel('CDF')
ax[1].legend(loc='best');

The above suggests that a normal distribution might be a good candidate to apply to the likelihood. Next is to setup the bayesian model. I'll do the fourth order first and then third order. Note how the input data is cast as a theano shared variable. This is not necessary to run the model below. I'll revisit why I do this later when I start playing with the model.

The Kruschke diagram for both O(3) and O(4) models is shown below for illustration:



In [14]:

    
X_Shrd = shared(X_train)



In [76]:

    
with pm.Model() as OC4_ord4:
    a0_4 = pm.Normal('a0_4', mu=0, sd=10)
    a1_4 = pm.Normal('a1_4', mu=0, sd=10)
    a2_4 = pm.Normal('a2_4', mu=0, sd=10)
    a3_4 = pm.Normal('a3_4', mu=0, sd=10)
    a4_4 = pm.Normal('a4_4', mu=0, sd=10)
    epsilon_4 = pm.HalfNormal('epsilon_4', sd=5)
    mu_4 = a0_4 + a1_4 * X_Shrd + a2_4 * X_Shrd**2 + a3_4 * X_Shrd**3 + a4_4 * X_Shrd**4
    chlPred_4 = pm.Normal('chlPred_4', mu=mu_4, sd=epsilon_4, observed=y_train)
    start = pm.find_MAP()
    step = pm.NUTS(scaling=start)
    trace_ord4 = pm.sample(2000, tune=1000, step=step, start=start)









    



Optimization terminated successfully.
         Current function value: 203.237244
         Iterations: 17
         Function evaluations: 26
         Gradient evaluations: 26






    



100%|██████████| 3000/3000 [00:25<00:00, 115.66it/s]



In [77]:

    
with pm.Model() as OC4_ord4_reg:
    a0_4 = pm.Normal('a0_4', mu=0, sd=1)
    a1_4 = pm.Normal('a1_4', mu=0, sd=1)
    a2_4 = pm.Normal('a2_4', mu=0, sd=1)
    a3_4 = pm.Normal('a3_4', mu=0, sd=1)
    a4_4 = pm.Normal('a4_4', mu=0, sd=1)
    epsilon_4 = pm.HalfNormal('epsilon_4', sd=5)
    mu_4 = a0_4 + a1_4 * X_Shrd + a2_4 * X_Shrd**2 + a3_4 * X_Shrd**3 + a4_4 * X_Shrd**4
    chlPred_4 = pm.Normal('chlPred_4', mu=mu_4, sd=epsilon_4, observed=y_train)
    start = pm.find_MAP()
    step = pm.NUTS(scaling=start)
    trace_ord4_reg = pm.sample(2000, tune=1000, step=step, start=start)









    



INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock






    



Optimization terminated successfully.
         Current function value: 202.179413
         Iterations: 18
         Function evaluations: 27
         Gradient evaluations: 27






    



100%|██████████| 3000/3000 [00:19<00:00, 155.91it/s]



In [78]:

    
with pm.Model() as OC4_ord3:
    a0_3 = pm.Normal('a0_3', mu=0, sd=10)
    a1_3 = pm.Normal('a1_3', mu=0, sd=10)
    a2_3 = pm.Normal('a2_3', mu=0, sd=10)
    a3_3 = pm.Normal('a3_3', mu=0, sd=10)
    epsilon_3 = pm.HalfNormal('epsilon_3', sd=5)
    mu_3 = a0_3 + a1_3 * X_Shrd + a2_3 * X_Shrd**2 + a3_3 * X_Shrd**3
    chlPred_3 = pm.Normal('chlPred_3', mu=mu_3, sd=epsilon_3, observed=y_train)
    start = pm.find_MAP()
    step = pm.NUTS(scaling=start)
    trace_ord3 = pm.sample(2000, tune=1000, step=step, start=start)









    



Optimization terminated successfully.
         Current function value: 201.375344
         Iterations: 16
         Function evaluations: 25
         Gradient evaluations: 25






    



100%|██████████| 3000/3000 [00:11<00:00, 265.94it/s]



In [79]:

    
with pm.Model() as OC4_ord3_reg:
    a0_3 = pm.Normal('a0_3', mu=0, sd=1)
    a1_3 = pm.Normal('a1_3', mu=0, sd=1)
    a2_3 = pm.Normal('a2_3', mu=0, sd=1)
    a3_3 = pm.Normal('a3_3', mu=0, sd=1)
    epsilon_3 = pm.HalfNormal('epsilon_3', 5)
    mu_3 = a0_3 + a1_3 * X_Shrd + a2_3 * X_Shrd**2 + a3_3 * X_Shrd**3
    chlPred_3_reg = pm.Normal('chlPred_3_reg', mu=mu_3, sd=epsilon_3, observed=y_train)
    start = pm.find_MAP()
    step = pm.NUTS(scaling=start)
    trace_ord3_reg = pm.sample(2000, tune=1000, step=step, start=start)









    



INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock






    



Optimization terminated successfully.
         Current function value: 204.461262
         Iterations: 15
         Function evaluations: 24
         Gradient evaluations: 24






    



100%|██████████| 3000/3000 [00:10<00:00, 283.08it/s]

Parameter covariance



In [81]:

    
f, ax = pl.subplots(ncols=2, nrows=2, figsize=(12,12))
for axi, trace, nm in zip(ax.ravel(),
                      [trace_ord4, trace_ord4_reg, trace_ord3, trace_ord3_reg],
                      ['o4', 'o4-reg', 'o3', 'o3-reg']):
    varnms = [name for name in trace.varnames if name[0] == 'a']
    dftracecorr = pm.trace_to_dataframe(trace, varnames=varnms).corr() 
    sb.heatmap(dftracecorr, ax=axi, annot=True, annot_kws={'fontsize': 12},
               mask=np.triu(np.ones(dftracecorr.shape), k=0))
    axi.set_title(nm)



In [56]:

    
def PlotPosteriors(trace, alpha=0.05, refval=None):
    varnms = [nm for nm in trace.varnames if nm[0]=='a']
    axs = pm.plot_posterior(trace, varnames=varnms, ref_val=refval)
    return axs



In [82]:

    
axs = PlotPosteriors(trace_ord4, refval=0)
axs[0].set_xlim((0.2, 0.29));
axs[1].set_xlim((-3.5, -2.7))
axs[2].set_xlim((2, 4.2));



In [85]:

    
axs = PlotPosteriors(trace_ord4_reg, refval=0)
axs[0].set_xlim((0.20, 0.3))
axs[1].set_xlim((-3.3, -2.8))
axs[2].set_xlim((2, 4));



In [86]:

    
axs = PlotPosteriors(trace_ord3, refval=None)



In [88]:

    
axs = PlotPosteriors(trace_ord3_reg)
#axs[0].set_xlim((0.25, 0.35))
#axs[0]

Note how in the case of the order 4 polynomial regression, a3 and a4 straddle the '0' line with rather wide HDI's. This is not the case with any of the order 3 polynomial regression.

Model Comparisons:

The diagnostics I will use here are

posterior predictive checks on in-sample data
information criteria; including
- deviance information criterion
- widely available information criterion
posterior predictive checks on out-of-sample data

Posterior Predictive Checks on in-sample data



In [89]:

    
sig0_ord4, sig1_ord4 = GetPPC(data=X_train, mcChain=trace_ord4, pmModel=OC4_ord4, varname='chlPred_4')
bayesCoeffs_ord4 = [trace_ord4['a%d_4' %i].mean() for i in range(5)]









    



100%|██████████| 1000/1000 [00:00<00:00, 72.36it/s]



In [90]:

    
sig0_ord4_reg, sig1_ord4_reg = GetPPC(data=X_train, mcChain=trace_ord4_reg, pmModel=OC4_ord4_reg, varname='chlPred_4')
bayesCoeffs_ord4_reg = [trace_ord4_reg['a%d_4' %i].mean() for i in range(5)]









    



100%|██████████| 1000/1000 [00:00<00:00, 1167.64it/s]



In [91]:

    
sig0_ord3, sig1_ord3 = GetPPC(data=X_train, mcChain=trace_ord3, pmModel=OC4_ord3, varname='chlPred_3')
bayesCoeffs_ord3 = [trace_ord3['a%d_3' %i].mean() for i in range(4)]









    



100%|██████████| 1000/1000 [00:00<00:00, 1263.34it/s]



In [92]:

    
sig0_ord3_reg, sig1_ord3_reg = GetPPC(data=X_train, mcChain=trace_ord3_reg, pmModel=OC4_ord3_reg,
                                      varname='chlPred_3_reg')
bayesCoeffs_ord3_reg = [trace_ord3_reg['a%d_3' %i].mean() for i in range(4)]









    



100%|██████████| 1000/1000 [00:00<00:00, 1234.56it/s]



In [98]:

    
_,axs = pl.subplots(nrows=2,ncols=2, figsize=(16,13), sharex=True)
_ = PlotPPC(X_train, y_train, sig0_ord4, sig1_ord4, bayesCoeffs_ord4,ax=axs[0,0],
            title='PPC (Training Data) - Order 4', xlblOn=False, legendOn=False)
_ = PlotPPC(X_train, y_train, sig0_ord4_reg, sig1_ord4_reg, bayesCoeffs_ord4_reg,ax=axs[0,1],
            title='PPC (Training Data) - Order 4 w/Reg', xlblOn=False)
_ = PlotPPC(X_train, y_train, sig0_ord3, sig1_ord3, bayesCoeffs_ord3,ax=axs[1,0], 
            title='PPC (Training Data) - Order 3')
_ = PlotPPC(X_train, y_train, sig0_ord3_reg, sig1_ord3_reg, bayesCoeffs_ord3_reg,ax=axs[1,1],
            title='PPC (Training Data) - Order 3 w/Reg', legendOn=False)

Widely Available Information Criteria (DIC)

what it is
how it's calculated
what it tells us



In [99]:

    
modlNames = ['o4', 'o4_reg', 'o3', 'o3_reg']
dfComp = pm.compare([trace_ord4, trace_ord4, trace_ord3, trace_ord3_reg],
                   [OC4_ord4, OC4_ord4_reg, OC4_ord3, OC4_ord3_reg])
dfComp.index = [modlNames[i] for i in dfComp.index]
dfComp









    



INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock






    Out[99]:







  
    
      
      WAIC
      pWAIC
      dWAIC
      weight
      SE
      dSE
      warning
    
  
  
    
      o4
      379.578
      5.50634
      0
      0.336696
      56.3988
      0
      0
    
    
      o4_reg
      379.578
      5.50634
      0
      0.336696
      56.3988
      0
      0
    
    
      o3
      380.289
      4.61867
      0.711159
      0.235946
      56.6251
      0.303433
      0
    
    
      o3_reg
      382.202
      4.43504
      2.62408
      0.0906622
      55.4651
      0.276846
      0



In [100]:

    
ax = pm.compareplot(dfComp)
ax.set_ylim(-1.1, 0.1);
ax.legend()

Predicting Outcome from New Data

A machine learning practitioner who what (s)he's doing will want to apply the model to data that has not been used in the fitting process to assess the model's skill in real world predictions. The advantage in this particular case however is that in addition to prediction, we'll get uncertainties, not really available off the shelf with scikit-learn or other machine learning libraries. Even better, we can use the now familiar posterior predictive checks approach to do just that.

As a reminder I framed the input data as a theano shared variable to allow replacing the training data with the test data. The test data consists in SeaWiFS matchup data that originates, like NOMAD, from SeaBASS. However, the data in this set was gathered after NOMAD was put together (2010 on) to avoid data leakage.

Another interesting aspect of this particular test data set is that because it's a matchup set it contains corresponding in situ and satellite measurements, which makes for an interesting performance comparison of the model.



In [101]:

    
# load test set
dfTest = pd.read_pickle('./bayesianChl_DATA/dfSwfChloraMups2010p.pkl')
dfTest.info()
dfLogTest = pd.DataFrame(columns=['id','seawifs_mxBl2Gr','insitu_chlor_a', 'seawifs_chlor_a'])
dfLogTest.id = dfTest.id
dfLogTest.seawifs_mxBl2Gr=  np.log10(dfTest.seawifs_mxBl2Gr)
dfLogTest.insitu_chlor_a = np.log10(dfTest.insitu_chlor_a)
dfLogTest.seawifs_chlor_a = np.log10(dfTest.seawifs_chlor_a)
X_test, y_test = dfLogTest.seawifs_mxBl2Gr.values, dfLogTest.insitu_chlor_a.values









    



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 59 entries, 0 to 58
Data columns (total 12 columns):
id                   59 non-null int64
latitude             59 non-null float64
longitude            59 non-null float64
date_time            59 non-null datetime64[ns]
cruise               59 non-null object
seawifs_solz         59 non-null float64
seawifs_senz         59 non-null float64
seawifs_windspeed    59 non-null float64
seawifs_tdiff        59 non-null int64
seawifs_mxBl2Gr      59 non-null float64
insitu_chlor_a       59 non-null float64
seawifs_chlor_a      59 non-null float64
dtypes: datetime64[ns](1), float64(8), int64(2), object(1)
memory usage: 5.6+ KB



In [102]:

    
# SUBSTITUTING TRAINING SET FOR TEST SET - CHANGING VALUES HERE WILL ALSO CHANGE VALUES IN THE MODEL
X_Shrd.set_value(X_test)
sig0_3Tst, sig1_3Tst = GetPPC(data=X_test, mcChain=trace_ord3, pmModel=OC4_ord3, varname='chlPred_3')
sig0_3RegTst, sig1_3RegTst = GetPPC(data=X_test, mcChain=trace_ord3_reg, pmModel=OC4_ord3_reg, varname='chlPred_3_reg')
sig0_4Tst, sig1_4Tst = GetPPC(data=X_test, mcChain=trace_ord4, pmModel=OC4_ord4, varname='chlPred_4')
sig0_4TstReg, sig1_4TstReg = GetPPC(data=X_test, mcChain=trace_ord4_reg, pmModel=OC4_ord4_reg, varname='chlPred_4')









    



100%|██████████| 1000/1000 [00:00<00:00, 1361.35it/s]
100%|██████████| 1000/1000 [00:00<00:00, 1572.40it/s]
100%|██████████| 1000/1000 [00:00<00:00, 1560.14it/s]
100%|██████████| 1000/1000 [00:00<00:00, 1586.23it/s]



In [103]:

    
bayesCoeffs_3Tst = [trace_ord3['a%d_3' %i].mean() for i in range(4)]
bayesCoeffs_3TstReg = [trace_ord3_reg['a%d_3' %i].mean() for i in range(4)]
bayesCoeffs_4Tst = [trace_ord4['a%d_4' %i].mean() for i in range(5)]
bayesCoeffs_4TstReg = [trace_ord4_reg['a%d_4' %i].mean() for i in range(5)]



In [107]:

    
f,axs = pl.subplots(nrows=2, ncols=2, figsize=(16,12),sharex=True, sharey=True)
_ = PlotPPC(X_test, y_test, sig0_4Tst, sig1_4Tst, bayesCoeffs_4Tst, ax=axs[0,0],
            title='PPC (Training Data) - Order 4', legendOn=False, xlblOn=False)
_ = PlotPPC(X_test, y_test, sig0_4TstReg, sig1_4TstReg, bayesCoeffs_4TstReg, ax=axs[0,1],
            title='PPC (Training Data) - Order 4, w/Reg',
           legendOn=False, xlblOn=False, ylblOn=False )
_ = PlotPPC(X_test, y_test, sig0_3Tst, sig1_3Tst, bayesCoeffs_3Tst, ax=axs[1,0],
            title='PPC (Training Data) - Order 3',
           legendOn=False)
_ = PlotPPC(X_test, y_test, sig0_3RegTst, sig1_3RegTst, bayesCoeffs_3TstReg, ax=axs[1,1],
            title='PPC (Training Data) - Order 3, w/Reg.', ylblOn=False)

Similarly to PPCs on in-sample data, the results of PPC on out-of-sample (test) data confirms that the simpler O(3) polynomial regression does not perform much worse than its more complex O(4) counterpart. From a more frequentist point of view, computing MSEs on frequentist OLS and Bayesian forms of both models shows a slight improvement in favor of the simpler, O(3) version.

Partial Pooling & Hierarchical Bayes

Hierarchical O(4) and O(3) Models

So far the inferences described above carry the implicit assumption that all data are the same, and as such can be pooled to calculate a single regression. This particular dataset may contradict this assumption in two ways.

The first potential violation concerns the predicted in-situ chlorophyll data, which was obtain via one of two methods; hplc or fluorescence. One approach might be to only retain data where chlorophyll is only of one type, e.g. hplc, which is expected to produce more consistent estimates, and throw out the rest of the data. This is a bad idea because the fluorescence data still carries information, no matter how noisy. Because prediction is the goal of the exercise, the algorithm should be trained with as much data as possible.

The second potential violation is related to the predictor variable. Specifically, band ratio used was computed by selecting the largest of 3 (in this particular example) band ratios. The band selection implies more complexity than what a polynomial regression can handle.

One common way to address these issues is to introduce interaction terms. However this adds undue complication to polynomial regression of order 2 or above. High-order polynomial regressions are already practically uninterpretable because of the compounding of the predictor variable. An alternative is to use a hierarchical framework. Here, the introduction of hyperpriors allow similarities between data type to be taken into account, while enabling a category-specific inference to take place.

Below, I address the two partial pooling schemes mentioned above - chlorophyll measurement method and band ratio - separately.

As a reminder, the dataset used here, the first few lines of which are displayed again below, has two columns, 'chl_id', and 'maxBlue_id'. These specify respectively the method by which chlorophyll was measured, and the band selected to compute the ratio, for each particular observation. The inferential framework used here expects numerical input so two more columns, 'chl_cat' and 'maxBlue_cat' contain corresponding integer labels.



In [243]:

    
dfLogOCx.head()









    Out[243]:







  
    
      
      id
      mxBl2Gr
      chl
      chl_id
      chl_cat
      maxBlue_id
      maxBlue_cat
    
  
  
    
      0
      1565
      -0.267867
      1.581950
      fluor
      0
      rrs510
      2
    
    
      1
      1566
      -0.220457
      1.544192
      fluor
      0
      rrs510
      2
    
    
      2
      1567
      -0.268537
      1.429914
      fluor
      0
      rrs510
      2
    
    
      3
      1568
      -0.279976
      1.680879
      fluor
      0
      rrs510
      2
    
    
      4
      1559
      -0.167733
      1.371991
      fluor
      0
      rrs510
      2

Hierarchical model: chlorophyll type

First a quick look the different categories are shown below:



In [246]:

    
dfLogOCx.chl_id.value_counts(normalize=True)









    Out[246]:





fluor    0.589567
hplc     0.410433
Name: chl_id, dtype: float64

The above shows the dataset comprising a 59%/41% fluorescence/hplc split; not exactly balanced, but not terribly skewed either



In [255]:

    
sb.set(font_scale=1.5)
g = sb.PairGrid(dfLogOCx, vars=['chl', 'mxBl2Gr'],size=5, diag_sharey=False, hue='chl_id',
               hue_kws={'cmap':["Greens_d", "Blues_d"]});
g = g.map_upper(pl.scatter,alpha=0.5)
g = g.map_diag(sb.kdeplot, lw=3)
g = g.map_lower(sb.kdeplot);
g.add_legend();
g.fig.suptitle('Band Ratio & Chl w/ chl type Categorization')
g.fig.tight_layout(pad=1.5)

The model formulation is more easily understandable visually. Below is the corresponding Kruschke diagram:

Below I create indices, chl_idx, from the chlorophyll measurement method used for corresponding observations. These I will use to inform the partial pools to which the data belong. The variable numGroups simply contains the number of groups to which the observations are allocated.



In [109]:

    
numGroups = dfLogOCx.chl_cat.unique().size
chl_idx = dfLogOCx.chl_cat.values



In [113]:

    
X_Shrd.set_value(X_train)



In [114]:

    
with pm.Model() as OC4_h4_chl:
    # Hyperpriors
    mu_a0 = pm.Normal('mu_a0', mu=0, sd=10)
    sd_a0 = pm.HalfNormal("sd_a0", sd=10)
    mu_a1 = pm.Normal('mu_a1', mu=0, sd=10)
    sd_a1= pm.HalfNormal('sd_a1', sd=10)
    mu_a2 = pm.Normal('mu_a2', mu=0, sd=10)
    sd_a2 = pm.HalfNormal('sd_a2', sd=10)
    mu_a3 = pm.Normal('mu_a3', mu=0, sd=10)
    sd_a3 = pm.HalfNormal("sd_a3", sd=10)
    mu_a4 = pm.Normal('mu_a4', mu=0, sd=10)
    sd_a4 = pm.HalfNormal("sd_a4", sd=10)
    #Priors
    a0_h4_chl = pm.Normal('a0_h4_chl', mu=mu_a0, sd=sd_a0, shape=numGroups)
    a1_h4_chl = pm.Normal('a1_h4_chl', mu=mu_a1, sd=sd_a1, shape=numGroups)
    a2_h4_chl = pm.Normal('a2_h4_chl', mu=mu_a2, sd=sd_a2, shape=numGroups)
    a3_h4_chl = pm.Normal('a3_h4_chl', mu=mu_a3, sd=sd_a3, shape=numGroups)
    a4_h4_chl = pm.Normal('a4_h4_chl', mu=mu_a4, sd=sd_a4, shape=numGroups)
    
    # Likelihood
    epsilon_h4 = pm.HalfNormal('epsilon_h4', 5)
    mu_h4_chl =  a0_h4_chl[chl_idx]  + X_Shrd * (a1_h4_chl[chl_idx] + a2_h4_chl[chl_idx] * X_Shrd +
                                                         a3_h4_chl[chl_idx] * X_Shrd**2 +
                                                         a4_h4_chl[chl_idx] * X_Shrd**3)
    chlPred_h4_chl = pm.Normal('chlPred_h4_chl', mu=mu_h4_chl, sd=epsilon_h4, observed=y_train)
    
    # Sample
    mu, sds, elbo = pm.variational.advi(n=100000)
    step = pm.NUTS(scaling=OC4_h4_chl.dict_to_array(sds)**2, is_cov=True)
    trace_h4_chl = pm.sample(2000, tune=1000, step=step, start=mu)









    



INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock
Average ELBO = -234.78: 100%|██████████| 100000/100000 [00:41<00:00, 2436.37it/s]
Finished [100%]: Average ELBO = -235.14
INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock
100%|██████████| 3000/3000 [03:44<00:00, 13.34it/s]

Below, the results of the inference are displayed, with the coefficients for each category displayed. As suggested by the Dataframe displayed earlier, the "/_0" index corresponds to fluorescence and the "/_1" the index to hplc. For brevity only the the distributions of the regression coefficients are displayed.



In [116]:

    
vnames = [vn for vn in trace_h4_chl.varnames if vn[0] == 'a']
pm.plot_posterior(trace_h4_chl, varnames=vnames);

Repeat the above for the O(3) model:



In [117]:

    
with pm.Model() as OC4_h3_chl:
    # Hyperpriors
    mu_a0 = pm.Normal('mu_a0', mu=0, sd=10)
    sd_a0 = pm.HalfNormal("sd_a0", sd=10)
    mu_a1 = pm.Normal('mu_a1', mu=0, sd=10)
    sd_a1= pm.HalfNormal('sd_a1', sd=10)
    mu_a2 = pm.Normal('mu_a2', mu=0, sd=10)
    sd_a2 = pm.HalfNormal('sd_a2', sd=10)
    mu_a3 = pm.Normal('mu_a3', mu=0, sd=10)
    sd_a3 = pm.HalfNormal("sd_a3", sd=10)
    #Priors
    a0_h3_chl = pm.Normal('a0_h3_chl', mu=mu_a0, sd=sd_a0, shape=numGroups)
    a1_h3_chl = pm.Normal('a1_h3_chl', mu=mu_a1, sd=sd_a1, shape=numGroups)
    a2_h3_chl = pm.Normal('a2_h3_chl', mu=mu_a2, sd=sd_a2, shape=numGroups)
    a3_h3_chl = pm.Normal('a3_h3_chl', mu=mu_a3, sd=sd_a3, shape=numGroups)
    
    # Likelihood
    epsilon_h3 = pm.HalfNormal('epsilon_h3', 5)
    mu_h3_chl =  a0_h3_chl[chl_idx]  + X_Shrd * (a1_h3_chl[chl_idx] + a2_h3_chl[chl_idx] * X_Shrd +
                                                         a3_h3_chl[chl_idx] * X_Shrd**2)
    chlPred_h3_chl = pm.Normal('chlPred_h3_chl', mu=mu_h3_chl, sd=epsilon_h3, observed=y_train)
    
    # Sample
    mu, sds, elbo = pm.variational.advi(n=100000)
    step = pm.NUTS(scaling=OC4_h3_chl.dict_to_array(sds)**2, is_cov=True)
    trace_h3_chl = pm.sample(2000, tune=1000, step=step, start=mu)









    



INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock
Average ELBO = -225.73: 100%|██████████| 100000/100000 [00:38<00:00, 2600.38it/s]
Finished [100%]: Average ELBO = -224.96
INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock
100%|██████████| 3000/3000 [01:37<00:00, 30.88it/s]



In [118]:

    
vnames = [vn for vn in trace_h3_chl.varnames if vn[0] == 'a']
pm.plot_posterior(trace_h3_chl, varnames=vnames);

Performance of the chl type-based hierarchical models

For reference I use both unregularized versions of the O(3) and O(4) models.



In [119]:

    
modNames = ['OC4_ord4', 'OC4_ord3', 'OC4_h4_chl', 'OC4_h3_chl']
models = [OC4_ord4, OC4_ord3, OC4_h4_chl, OC4_h3_chl]
traces = [trace_ord4, trace_ord3, trace_h4_chl, trace_h3_chl]
dfcomp2 = pm.compare(traces, models=models)
dfcomp2.index = [modNames[i] for i in dfcomp2.index]
dfcomp2









    



INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock






    Out[119]:







  
    
      
      WAIC
      pWAIC
      dWAIC
      weight
      SE
      dSE
      warning
    
  
  
    
      OC4_h4_chl
      359.783
      8.43136
      0
      0.664913
      58.7747
      0
      0
    
    
      OC4_h3_chl
      361.154
      7.38737
      1.37087
      0.33503
      58.5027
      0.250991
      0
    
    
      OC4_ord4
      379.578
      5.50634
      19.7948
      3.3448e-05
      56.3988
      3.84135
      0
    
    
      OC4_ord3
      380.289
      4.61867
      20.506
      2.34392e-05
      56.6251
      4.38565
      0

The above suggests that a the chlorophyll type is important in shaping the regression. Furthermore, the table suggests that a weighted ensemble average of both hierarchical models would be preferable.

Hierarchical model: max. blue band

Next is the implementation of similar hierarchical models, with the partial pooling done according to the blue band that was used to compute the band ratio. The figure below illustrates this partition in the dataset. Again, I use both O(4) and O(3) models as reference



In [162]:

    
sb.set(font_scale=1.5)
g = sb.PairGrid(dfLogOCx, vars=['chl', 'mxBl2Gr'],size=5, diag_sharey=False, hue='maxBlue_id',
                hue_kws={"cmap": ["Reds_d", "Greens_d", "Blues_d"]});
g = g.map_upper(pl.scatter,alpha=0.75)
g = g.map_diag(sb.kdeplot, lw=3)
g = g.map_lower(sb.kdeplot);
g.add_legend();

As before, before the model is built, the index to the blue band used is stored; here in mxBl_idx.



In [121]:

    
numGroups = dfLogOCx.maxBlue_cat.unique().size
mxBl_idx = dfLogOCx.maxBlue_cat.values
with pm.Model() as OC4_h4_mxBl:
    # Hyperpriors
    mu_a0 = pm.Normal('mu_a0', mu=0, sd=10)
    sd_a0 = pm.HalfNormal("sd_a0", sd=10)
    mu_a1 = pm.Normal('mu_a1', mu=0, sd=10)
    sd_a1= pm.HalfNormal('sd_a1', sd=10)
    mu_a2 = pm.Normal('mu_a2', mu=0, sd=10)
    sd_a2 = pm.HalfNormal('sd_a2', sd=10)
    mu_a3 = pm.Normal('mu_a3', mu=0, sd=10)
    sd_a3 = pm.HalfNormal("sd_a3", sd=10)
    mu_a4 = pm.Normal('mu_a4', mu=0, sd=10)
    sd_a4 = pm.HalfNormal("sd_a4", sd=10)
    #Priors
    a0_h4_mb = pm.Normal('a0_h4_mb', mu=mu_a0, sd=sd_a0, shape=numGroups)
    a1_h4_mb = pm.Normal('a1_h4_mb', mu=mu_a1, sd=sd_a1, shape=numGroups)
    a2_h4_mb = pm.Normal('a2_h4_mb', mu=mu_a2, sd=sd_a2, shape=numGroups)
    a3_h4_mb = pm.Normal('a3_h4_mb', mu=mu_a3, sd=sd_a3, shape=numGroups)
    a4_h4_mb = pm.Normal('a4_h4_mb', mu=mu_a4, sd=sd_a4, shape=numGroups)
    # Likelihood
    epsilon_h4 = pm.HalfNormal('epsilon_h4', 5)
    mu_h4_mb =  a0_h4_mb[mxBl_idx] + X_Shrd * (a1_h4_mb[mxBl_idx] + a2_h4_mb[mxBl_idx] * X_Shrd +
                                               a3_h4_mb[mxBl_idx] * X_Shrd**2 +
                                               a4_h4_mb[mxBl_idx] * X_Shrd**3)
    chlPred_h4_mb = pm.Normal('chlPred_h4_mb', mu=mu_h4_mb, sd=epsilon_h4, observed=y_train)
    #start = pm.find_MAP()
    mu, sds, elbo = pm.variational.advi(n=100000)
    step = pm.NUTS(scaling=OC4_h4_mxBl.dict_to_array(sds)**2, is_cov=True)
    trace_h4_mxBl = pm.sample(2000, tune=1000, step=step, start=mu)
    #step = pm.NUTS(scaling=start)
    #trace_h3r = pm.sample(10000, step=step, start=start)









    



Average ELBO = -233.01: 100%|██████████| 100000/100000 [00:49<00:00, 2035.73it/s]
Finished [100%]: Average ELBO = -231.12
INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock
100%|██████████| 3000/3000 [05:44<00:00,  8.70it/s]

Below are the posterior plots for the regression coefficients. Note that the number at the end of the coefficent name refers to the blue band used in the regression:

0: 443
1: 490
2: 510



In [124]:

    
vnames = [vn for vn in trace_h4_mxBl.varnames if vn[0]=='a']
pm.plot_posterior(trace_h4_mxBl, varnames=vnames);

Next is to repeat the above with the O(3) model



In [126]:

    
with pm.Model() as OC4_h3_mxBl:
    # Hyperpriors
    mu_a0 = pm.Normal('mu_a0', mu=0, sd=10)
    sd_a0 = pm.HalfNormal("sd_a0", sd=10)
    mu_a1 = pm.Normal('mu_a1', mu=0, sd=10)
    sd_a1= pm.HalfNormal('sd_a1', sd=10)
    mu_a2 = pm.Normal('mu_a2', mu=0, sd=10)
    sd_a2 = pm.HalfNormal('sd_a2', sd=10)
    mu_a3 = pm.Normal('mu_a3', mu=0, sd=10)
    sd_a3 = pm.HalfNormal("sd_a3", sd=10)
    #Priors
    a0_h3_mb = pm.Normal('a0_h3_mb', mu=mu_a0, sd=sd_a0, shape=numGroups)
    a1_h3_mb = pm.Normal('a1_h3_mb', mu=mu_a1, sd=sd_a1, shape=numGroups)
    a2_h3_mb = pm.Normal('a2_h3_mb', mu=mu_a2, sd=sd_a2, shape=numGroups)
    a3_h3_mb = pm.Normal('a3_h3_mb', mu=mu_a3, sd=sd_a3, shape=numGroups)
    # Likelihood
    epsilon_h3 = pm.HalfNormal('epsilon_h4', 5)
    mu_h3_mb =  a0_h3_mb[mxBl_idx] + X_Shrd * (a1_h3_mb[mxBl_idx] + a2_h3_mb[mxBl_idx] * X_Shrd +
                                               a3_h3_mb[mxBl_idx] * X_Shrd**2)
    chlPred_h3_mb = pm.Normal('chlPred_h3_mb', mu=mu_h3_mb, sd=epsilon_h3, observed=y_train)
    #start = pm.find_MAP()
    mu, sds, elbo = pm.variational.advi(n=100000)
    step = pm.NUTS(scaling=OC4_h3_mxBl.dict_to_array(sds)**2, is_cov=True)
    trace_h3_mxBl = pm.sample(2000, tune=1000, step=step, start=mu)
    #step = pm.NUTS(scaling=start)
    #trace_h3r = pm.sample(10000, step=step, start=start)









    



Average ELBO = -226: 100%|██████████| 100000/100000 [00:38<00:00, 2623.22it/s]   
Finished [100%]: Average ELBO = -225.13
INFO (theano.gof.compilelock): Refreshing lock /home/madhatter106/.theano/compiledir_Linux-4.4--generic-x86_64-with-debian-stretch-sid-x86_64-3.5.4-64/lock_dir/lock
100%|██████████| 3000/3000 [02:38<00:00, 19.28it/s]



In [128]:

    
vnames = [vn for vn in trace_h3_mxBl.varnames if vn[0]=='a']
pm.plot_posterior(trace_h3_mxBl, varnames=vnames, );

So which model???

For brevity I'll only use the WAIC-based comparison. Since it's simple enough, I'll throw in all 8 models built so far. A more thorough analysis is certainly warranted.



In [130]:

    
traces = [trace_ord4, trace_ord3, trace_ord4_reg, trace_ord3_reg,
          trace_h4_chl, trace_h3_chl, trace_h4_mxBl, trace_h3_mxBl]
models = [OC4_ord4, OC4_ord3, OC4_ord4_reg, OC4_ord3_reg,
          OC4_h4_chl, OC4_h3_chl, OC4_h4_mxBl, OC4_h3_mxBl]
modNames = ['OC4_ord4', 'OC4_ord3', 'OC4_ord4_reg', 'OC4_ord3_reg',
            'OC4_h4_chl', 'OC4_h3_chl', 'OC4_h4_mxBl', 'OC4_h3_mxBl']
dfComp = pm.compare(traces, models)
dfComp.index = [modNames[i] for i in dfComp.index]
dfComp









    Out[130]:







  
    
      
      WAIC
      pWAIC
      dWAIC
      weight
      SE
      dSE
      warning
    
  
  
    
      OC4_h4_mxBl
      294.208
      12.5076
      0
      0.819403
      53.9082
      0
      1
    
    
      OC4_h3_mxBl
      297.233
      11.5877
      3.02462
      0.180597
      53.6337
      0.115998
      1
    
    
      OC4_h4_chl
      359.783
      8.43136
      65.5752
      4.72088e-15
      58.7747
      31.5568
      0
    
    
      OC4_h3_chl
      361.154
      7.38737
      66.9461
      2.37872e-15
      58.5027
      31.8322
      0
    
    
      OC4_ord4_reg
      378.86
      4.66388
      84.6524
      3.39987e-19
      55.7739
      23.854
      0
    
    
      OC4_ord4
      379.578
      5.50634
      85.37
      2.37481e-19
      56.3988
      24.3701
      0
    
    
      OC4_ord3
      380.289
      4.61867
      86.0812
      1.66419e-19
      56.6251
      25.7459
      0
    
    
      OC4_ord3_reg
      382.202
      4.43504
      87.9941
      6.39465e-20
      55.4651
      24.5369
      0



In [133]:

    
f, ax = pl.subplots(figsize=(12,12))
ax = pm.compareplot(dfComp, ax=ax)
ax.set_ylim(-1.1, 0.1)
ax.legend(dfcomp_plot_leg_lbls);

A few observations can be drawn from the above.

a hierarchical model is best suited because of groupable heterogeneity within the training data. Both chl- and blue band-based groupings outperform the pooled models. The chl-based partially pooled model is mostly to demonstrate the importance of methodology used to derive in-situ data. This model's dependence on in-situ measurement methodology does not make it a practical solution for predicting out-of-sample satellite data that does not necessarily have matching in-situ chlorophyll.
The blue band used for the ratio appears to overwhelmingly be the most important group index.
the blue band-based O(4) hierarchical model is predicted to perform best out of sample. However the table suggests that a weighted ensemble average of both O(4) and O(3) blue-band-based hierarchical models (approx. 82% to 18% respectively) would result in the best performance. Such an ensemble average model is more flexible and apt to deal better with the smaller number of data points corresponding to very low and very high chlorophyll concentration.
if pooled models are unavoidable, O(4) and O(3) do not appear remarkably different in their performance.



In [ ]:

	id	mxBl2Gr	chl	chl_id	maxBlue_id	maxBlue_cat
0	1565	-0.267867	1.581950	fluor	rrs510	2
1	1566	-0.220457	1.544192	fluor	rrs510	2
2	1567	-0.268537	1.429914	fluor	rrs510	2
3	1568	-0.279976	1.680879	fluor	rrs510	2
4	1559	-0.167733	1.371991	fluor	rrs510	2

	WAIC	pWAIC	dWAIC	weight	SE	dSE
o4	379.578	5.50634	0	0.336696	56.3988	0
o4_reg	379.578	5.50634	0	0.336696	56.3988	0
o3	380.289	4.61867	0.711159	0.235946	56.6251	0.303433
o3_reg	382.202	4.43504	2.62408	0.0906622	55.4651	0.276846

	WAIC	pWAIC	dWAIC	weight	SE	dSE
OC4_h4_chl	359.783	8.43136	0	0.664913	58.7747	0
OC4_h3_chl	361.154	7.38737	1.37087	0.33503	58.5027	0.250991
OC4_ord4	379.578	5.50634	19.7948	3.3448e-05	56.3988	3.84135
OC4_ord3	380.289	4.61867	20.506	2.34392e-05	56.6251	4.38565

	WAIC	pWAIC	dWAIC	weight	SE	dSE	warning
OC4_h4_mxBl	294.208	12.5076	0	0.819403	53.9082	0	1
OC4_h3_mxBl	297.233	11.5877	3.02462	0.180597	53.6337	0.115998	1
OC4_h4_chl	359.783	8.43136	65.5752	4.72088e-15	58.7747	31.5568	0
OC4_h3_chl	361.154	7.38737	66.9461	2.37872e-15	58.5027	31.8322	0
OC4_ord4_reg	378.86	4.66388	84.6524	3.39987e-19	55.7739	23.854	0
OC4_ord4	379.578	5.50634	85.37	2.37481e-19	56.3988	24.3701	0
OC4_ord3	380.289	4.61867	86.0812	1.66419e-19	56.6251	25.7459	0
OC4_ord3_reg	382.202	4.43504	87.9941	6.39465e-20	55.4651	24.5369	0